Characterization of promoter elements of isoprene‐responsive genes and the ability of isoprene to bind START domain transcription factors

Abstract Isoprene has recently been proposed to be a signaling molecule that can enhance tolerance of both biotic and abiotic stress. Not all plants make isoprene, but all plants tested to date respond to isoprene. We hypothesized that isoprene interacts with existing signaling pathways rather than requiring novel mechanisms for its effect on plants. We analyzed the cis‐regulatory elements (CREs) in promoters of isoprene‐responsive genes and the corresponding transcription factors binding these promoter elements to obtain clues about the transcription factors and other proteins involved in isoprene signaling. Promoter regions of isoprene‐responsive genes were characterized using the Arabidopsis cis‐regulatory element database. CREs that bind ARR1, Dof, DPBF, bHLH112, GATA factors, GT‐1, MYB, and WRKY transcription factors, as well as light‐responsive elements, were overrepresented in promoters of isoprene‐responsive genes; CBF‐, HSF‐, and WUS‐binding motifs were underrepresented. Transcription factors corresponding to CREs overrepresented in promoters of isoprene‐responsive genes were mainly those important for stress responses: drought, salt/osmotic, oxidative, herbivory/wounding, and pathogen stress. More than half of the isoprene‐responsive genes contained at least one binding site for TFs of the class IV homeodomain leucine zipper (HD‐ZIP) family, such as GL2, ATML1, PDF2, HDG11, and ATHB17. While the HD‐zipper‐loop‐zipper (ZLZ) domain binds to the L1 box of the promoter region, a special domain called the steroidogenic acute regulatory protein‐related lipid transfer, or START, domain can bind ligands such as fatty acids (e.g., linolenic and linoleic acid). We tested whether isoprene might bind in such a START domain. Molecular simulations and modeling to test interactions between isoprene and a class IV HD‐ZIP family START‐domain‐containing protein were carried out.
Without membrane penetration by the HDG11 START domain, isoprene within the lipid bilayer was inaccessible to this domain, preventing protein interactions with membrane‐bound isoprene. The cross‐talk between isoprene‐mediated signaling and other growth regulator and stress signaling pathways, in terms of common CREs and transcription factors, could enhance the stability of the isoprene emission trait when it evolves in a plant, but so far it has not been possible to say how isoprene is sensed to initiate signaling responses.

between atom pairs within 16 Å distance. Long-range electrostatic interactions were calculated through the particle mesh Ewald (PME) method (Essmann et al., 1995) using a 1 Å grid spacing. Temperature and pressure were controlled by Langevin thermo- and barostats (Grest & Kremer, 1986; Feller et al., 1995) to the respective replica temperature and to the vapor pressure given by the Antoine equation (A = 8.14019, B = 1810.94 °C, C = 244.485 °C, result in Torr; Onken et al., 1989) multiplied by three. The higher pressures prevent water from vaporizing at higher replica temperatures, while liquid water densities do not significantly change with pressure. All atoms were coupled to the temperature bath with a coupling constant of 10 ps⁻¹. The cell volume oscillation period was set to 200 fs, with friction damping this oscillation with a decay time of 100 fs. The replica exchange TIGER2h simulations (Kulke et al., 2018) utilized 24 temperature replicas spanning 310-450 K. The replica temperatures were spaced equidistantly on a logarithmic scale over this range (310 K, 315.06 K, 320.21 K, 325.44 K, 330.76 K, 336.16 K, 341.65 K, 347.23 K, 352.9 K, 358.67 K, 364.53 K, 370.48 K, 376.54 K, 382.69 K, 388.94 K, 395.29 K, 401.75 K, 408.31 K, 414.98 K, 421.76 K, 428.65 K, 435.65 K, 442.77 K and 450 K). The TIGER2h method employs heating and cooling simulation cycles. After sampling the protein in explicit solvent for 16 ps, all replicas were cooled to 310 K within 4 ps. Exchanges were evaluated based on the Metropolis sampling criterion (Metropolis et al., 1953) for the protein conformations at 310 K in implicit solvent, calling OpenMM 8.4.5 (Eastman et al., 2017). Afterwards, temperatures were reassigned and the next cycle began.
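As an illustration of the replica setup described above, the following minimal sketch reproduces the logarithmically spaced temperature ladder and evaluates the Antoine equation with the stated constants, multiplied by three. The function names, the Kelvin-to-Celsius conversion, and the Torr-to-bar conversion are assumptions made for this example and are not taken from the simulation scripts.

```python
import math

def replica_temperatures(t_min=310.0, t_max=450.0, n_replicas=24):
    """Temperatures equidistant on a logarithmic scale between t_min and t_max (K)."""
    ratio = t_max / t_min
    return [t_min * ratio ** (i / (n_replicas - 1)) for i in range(n_replicas)]

# Antoine constants quoted in the text (temperature in deg C, pressure in Torr).
ANTOINE_A = 8.14019
ANTOINE_B = 1810.94
ANTOINE_C = 244.485
TORR_TO_BAR = 1.33322368e-3  # 1 Torr = 133.322368 Pa

def antoine_vapor_pressure_torr(t_kelvin):
    """Vapor pressure from log10(P) = A - B / (C + T), with T in deg C."""
    t_celsius = t_kelvin - 273.15
    return 10.0 ** (ANTOINE_A - ANTOINE_B / (ANTOINE_C + t_celsius))

def replica_pressure_bar(t_kelvin, factor=3.0):
    """Target pressure: Antoine vapor pressure times the factor of three from the text."""
    return factor * antoine_vapor_pressure_torr(t_kelvin) * TORR_TO_BAR

if __name__ == "__main__":
    for temperature in replica_temperatures():
        print(f"{temperature:7.2f} K  {replica_pressure_bar(temperature):8.3f} bar")
```

The geometric progression gives each pair of neighboring replicas the same temperature ratio, which is what "equidistant on a logarithmic scale" amounts to in practice.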
The protein free energy landscape was constructed from the TIGER2h simulation trajectories by investigating the structural similarity between conformations with dihedral principal component analysis (dPCA) (Altis et al., 2007). dPCA, in contrast to typical principal component analysis in Cartesian coordinate space, utilizes the rotationally and translationally invariant backbone dihedral angles ϕ and ψ to describe internal protein conformation. The first two principal components collectively encode 25% of the variation seen across protein conformations within the TIGER2h simulations. The resulting two-dimensional density was clustered with the density-based clustering algorithm DBSCAN (Ester et al., 1996) included in the scikit-learn library (Pedregosa et al., 2011). During the clustering, two sample points were considered neighbors up to a distance of 0.5 in the parameter space formed by the first two principal components, and density clusters contained at least 600 sample points.
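The dPCA and clustering steps described above can be sketched as follows. This is a minimal example under stated assumptions: the dihedral angles are synthetic stand-ins for trajectory data, the cosine/sine transform follows the usual dPCA formulation, and the helper names `dihedral_pca` and `cluster_conformations` are hypothetical. Only the DBSCAN settings (neighbor distance 0.5, at least 600 points) come from the text.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA

def dihedral_pca(dihedrals_rad, n_components=2):
    """Project frames onto principal components of cos/sin-transformed dihedrals.

    dihedrals_rad: array of shape (n_frames, n_angles) holding phi/psi in radians.
    """
    features = np.concatenate([np.cos(dihedrals_rad), np.sin(dihedrals_rad)], axis=1)
    pca = PCA(n_components=n_components)
    projection = pca.fit_transform(features)
    return projection, pca.explained_variance_ratio_

def cluster_conformations(projection, eps=0.5, min_samples=600):
    """Density-based clustering in the plane of the first two components."""
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(projection)

# Synthetic stand-in for a trajectory: two conformational states (assumption).
rng = np.random.default_rng(0)
n_frames_per_state, n_angles = 2000, 20
state_a = rng.normal(-1.0, 0.1, size=(n_frames_per_state, n_angles))
state_b = rng.normal(1.0, 0.1, size=(n_frames_per_state, n_angles))
dihedrals = np.vstack([state_a, state_b])

projection, explained = dihedral_pca(dihedrals)
labels = cluster_conformations(projection)
n_clusters = len(set(labels.tolist()) - {-1})  # label -1 marks noise points
print(f"clusters found: {n_clusters}, variance explained: {explained.sum():.2f}")
```

Transforming each angle to its cosine and sine avoids the discontinuity at ±180° that would otherwise distort a PCA performed directly on the angles.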

HDG11 binding studies to thylakoid membranes
The most likely protein conformations were tested for their ability to interact with biological membranes, where isoprene is expected to partition within a cell. The membrane builder interface on the CHARMM-GUI website (Jo et al., 2008; Wu et al., 2014) was used to generate multiple Arabidopsis thaliana thylakoid membrane models. Lipid head group and tail compositions were adjusted according to current lipidomics results, rounding percentages to whole lipid molecules (Table S1) (Block et al., 1983; Block et al., 2007; Fritz et al., 2007), such that each leaflet had 200 lipids. Sixty membranes were prepared in total, varying only the initially randomized lipid starting positions. The concentration of isoprene in thylakoid membranes has been reported to be 0.0044 mol% under physiological conditions. In 30 of these membranes, an excess of 20 mol% isoprene was introduced by replacing water molecules with isoprene.
All membrane systems were equilibrated for 200 ns before proceeding with the protein binding simulations.
The solvated replica exchange structure of HDG11's START domain was oriented above the equilibrated thylakoid membrane in six different rotations. The rotations correspond to the six faces of an imaginary cube around the protein, with the x, y, z, -x, -y, or -z face pointing towards the membrane surface. To improve the sampling statistics, the positional bias in the initial system state was removed by replicating each protein rotation five times and assigning each replicate one of the prepared membranes with unique lipid starting positions, resulting in 30 positionally different simulations. All systems were simulated for 200 ns.
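The sampling design described above (six cube-face orientations, five replicates each, one unique membrane per replicate) can be sketched as follows. This is an illustrative sketch only: it assumes the membrane lies below the protein with its normal along z, and the function `rotation_aligning` and the membrane indexing are hypothetical rather than taken from the setup scripts.

```python
import itertools
import numpy as np

# Assumption: the membrane lies below the protein, so "towards the membrane" is -z.
TOWARD_MEMBRANE = np.array([0.0, 0.0, -1.0])

FACES = {
    "+x": np.array([1.0, 0.0, 0.0]), "-x": np.array([-1.0, 0.0, 0.0]),
    "+y": np.array([0.0, 1.0, 0.0]), "-y": np.array([0.0, -1.0, 0.0]),
    "+z": np.array([0.0, 0.0, 1.0]), "-z": np.array([0.0, 0.0, -1.0]),
}

def rotation_aligning(a, b):
    """Rotation matrix that turns unit vector a onto unit vector b."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    c = float(np.dot(a, b))
    if np.isclose(c, 1.0):   # already aligned
        return np.eye(3)
    if np.isclose(c, -1.0):  # antiparallel: 180 degree turn about a perpendicular axis
        helper = np.array([1.0, 0.0, 0.0]) if abs(a[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
        axis = np.cross(a, helper)
        axis /= np.linalg.norm(axis)
        return 2.0 * np.outer(axis, axis) - np.eye(3)
    v = np.cross(a, b)
    k = np.array([[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]])
    return np.eye(3) + k + k @ k / (1.0 + c)  # Rodrigues formula

N_REPLICATES = 5
simulations = []
for membrane_id, (face, replicate) in enumerate(
        itertools.product(FACES, range(N_REPLICATES))):
    simulations.append({
        "face": face,
        "replicate": replicate,
        "membrane_id": membrane_id,  # each run gets its own prepared membrane
        "rotation": rotation_aligning(FACES[face], TOWARD_MEMBRANE),
    })

print(len(simulations), "starting configurations")
```

Applying each rotation matrix to the protein coordinates before placing the protein above the bilayer would give the six starting orientations.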
Intermolecular interaction distances were changed to consider interactions up to 12 Å with a switching function at 10 Å. The interaction pair list generation frequency was increased to every 100 steps.
Temperature and pressure were maintained at 298 K and 1 bar, respectively, using Langevin thermo- (Grest & Kremer, 1986) and barostats (Feller et al., 1995). The thermostat coupling constant was reduced to 1 ps⁻¹. The pressure was adjusted separately in the xy and z dimensions in a semi-isotropic ensemble to account for the different compressibilities of lipids and water molecules. All other simulation parameters were kept identical to those of the replica exchange simulation.
Contacts and distances were analyzed in VMD 1.9.4 (Humphrey et al., 1996) with scripts calculating the interactions between protein residues and membrane components. An interaction between proteins and membrane components was defined as a sigmoid function of the distance d in Å between heavy atoms.
The function counts short-distance interactions as approximately 1 and has its turning point at 4 Å, so that interactions near this distance still count as fractional contacts, which accounts for the fluid nature of some interactions. For computational efficiency, heavy atom pairs farther than 6 Å apart were not included in the overall contact sum. These interactions were ensemble-averaged over the trajectories for each amino acid.
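A contact measure of the kind described above can be sketched as follows. The turning point at 4 Å and the 6 Å cutoff come from the text; the logistic form and the steepness of 5 Å⁻¹ are assumptions chosen for illustration, since the exact expression is not reproduced here. The function names are hypothetical.

```python
import math

CONTACT_MIDPOINT_A = 4.0  # turning point of the sigmoid (from the text)
CONTACT_CUTOFF_A = 6.0    # pairs farther apart are skipped (from the text)
STEEPNESS_PER_A = 5.0     # assumption: not specified in the text

def contact_weight(distance_a):
    """Fractional contact between two heavy atoms separated by distance_a (in Å)."""
    return 1.0 / (1.0 + math.exp(STEEPNESS_PER_A * (distance_a - CONTACT_MIDPOINT_A)))

def contact_sum(distances_a):
    """Sum of fractional contacts for one residue in one frame."""
    return sum(contact_weight(d) for d in distances_a if d <= CONTACT_CUTOFF_A)

def ensemble_average(per_frame_distances):
    """Average the per-frame contact sums over all frames of a trajectory."""
    totals = [contact_sum(frame) for frame in per_frame_distances]
    return sum(totals) / len(totals) if totals else 0.0

if __name__ == "__main__":
    # Hypothetical heavy-atom distances (Å) for one residue in two frames.
    frames = [[3.1, 4.0, 5.5, 7.2], [2.8, 4.4, 6.5]]
    print(f"average contacts: {ensemble_average(frames):.3f}")
```

With this form, a pair at the 4 Å turning point contributes half a contact, while pairs well inside it contribute nearly one and pairs near the cutoff contribute almost nothing, so dropping pairs beyond 6 Å changes the sum very little.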